Murray-Rust Journal of Cheminformatics 201 1, 3:48
http://www.jcheminf.eom/content/3/1/48
Journal of
Cheminformatics
EDITORIAL Open Access
Semantic science and its communication -
a personal view
Peter Murray-Rust
Abstract
The articles in this special issue represent the culmination of about 15 years working with the potential of the web
to support chemical and related subjects. The selection of papers arises from a symposium held in January 201 1
(Visions of a Semantic Molecular Future') which gave me an opportunity to invite many people who shared the
same vision. I have asked them to contribute their papers and most have been able to do so. They cover a wide
range of content, approaches and styles and apart from the selection of the speakers (and hence the authors) I
have not exercised any control over the content.
Overview
The articles have a common theme of representing
information in a semantic manner - i.e. being largely
"understandable" by machine. This theme is common
across science and many of the articles can and should
be read by people outside the chemical sciences, includ-
ing information scientists, librarians, etc. An emergent
phenomenon of the last two decades is that information
systems can grow without top-down directions. This is
disruptive in that it empowers anyone with energy and
web-skills, and is most powerful when exercised in com-
munities of people with similar or complementary skills.
It is often possible to move very quickly, and in our
hackfests (one was prepended to the symposium) we
have shown that it is possible to prototype within a day
or two. This creates a new generation of scientist-hack-
ers (I use "hacker" as "A person who enjoys exploring
the details of programmable systems and stretching
their capabilities" [1]). Several of the authors in this
issue would regard themselves as "hackers" and enjoy
communicating through software and systems rather
than written English. This stretches the boundaries of
the possible but also creates tension where the main-
stream world cannot react on a hacker timescale and
with hacker ethics.
More generally many scientists and information pro-
fessionals are increasingly frustrated with the conven-
tional means of disseminating science. Most
Correspondence: pm286@cam.ac.uk
Unilever Centre for Molecular Science Informatics, Department of Chemistry,
Lensfield Road, Cambridge CB2 1EW, UK
© 201 1 Murray-Rust; licensee Chemistry Cer
Chemistry Central Commons Attribution ticense (http://creati\
reproduction in any medium, provided the
conventional publishers regard scientific articles as
"their content" and a very recent article (2011-06-20)
from the STM publishers [2] indicates that the publish-
ers believe they have the right to determine how content
is, or more often is not, used. As an example most for-
bid by default indexing, textmining, repurposing, even of
factual data to which the scientist has a legitimate sub-
scription. This has an entirely negative effect on infor-
mation-driven science, preventing even the development
of the technology.
Generally, therefore, there is a culture of bottom-up
change ("web democracy") which looks to the modern
web and examples of empowerment. (There are also
examples of disempowerment such as attacks on Net-
neutrality, walled gardens, information monopolies, ven-
dor lock-in, etc. and this contrast activates many in the
modern informatics world). There are several articles,
therefore, whose main theme is the access to Open
information.
Openness and the choice of BMC as publisher
I have been critical of many publishers for their stance
on closed information, and resolved that the issue
reporting the symposium had to be completely Open.
This is difficult in chemistry where there are almost no
"Open Access" journals (those where by default all arti-
cles are Open ("Gold")). The "Green" approach, where
articles may be posted free-as-in-beer but not free-as-in-
speech {e.g. CC-BY), is useless in science as it is impos-
sible to discover and harvest green articles. Hybrid jour-
nals (where articles may be made Open by publication
itral ftd. This is an Open Access article distributed under the terms of the Creative
'ecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and
original work is properly cited.
Murray-Rust Journal of Cheminformatics 201 1, 3:48
http://www.jcheminf.eom/content/3/1/48
Page 2 of 7
charges) are also of little value as the rights to the con-
tents are usually poorly labelled and a machine cannot
discover all "Open chemistry articles".
While writing this overview and several articles I have
become even more convinced that the only way of
creating full semantic science is to publish Openly (CC-
BY) and to publish completely (i.e. all experimental
information (CCO/PDDL)). I believe that most funders
now recognise this and are pushing, as hard as they can,
to create fully Openly published science. I think this has
to come, the question is how long it takes and in what
form.
I now believe that in many cases it is unethical to
restrict access to publicly funded science. Lessig, in his
CERN talk ("Scientific Knowledge Should Not Be
Reserved For Academic Elite" [3]), showed that it would
cost 500 USD for him to read the top 10 papers relating
to his child's condition. These papers are effectively only
available to academics in rich universities. A colleague
recently told me he had spent a month researching the
literature of his child's condition (to critically effective
purpose) and we agreed he could only do this because
he was a professor at a University. That is one reason I
support the Open Knowledge Foundation and its pro-
jects to define and obtain Open information (of which
Open Bibliography [4] in this issue is typical).
As part of this effort four of us (including authors in
this issue) developed the Panton Principles for Open
Scientific Data. These principles are simple and, we hope,
self-evidently worth pursuing and would lead to a greatly
increased substrate for the Scientific Semantic Web. We
were therefore delighted when BioMed Central not only
enthusiastically adopted the idea but took positive steps
to implement this as part of their publication process, for
example by labelling data items with the OKF's "OPEN
DATA" logo. This is valuable not only in making the
data repurposable, but also by promoting the concept -
many readers will now be familiar with the logo. BMC
have also encouraged authors (and editors) to highlight
outstanding examples of data publication (and done me
the honour of asking me to present their awards).
It is therefore a real pleasure to work with a publisher
who understands my, and my co-authors', intentions
and is prepared to work to make them happen. The arti-
cle explores many new types of publications and BMC
have undertaken, as far as technically possible, to imple-
ment them as examples of a new generation of publica-
tion technologies. I and others have been critical of PDF
as a publication format - it destroys semantics and inno-
vation, but we must "eat our own dogfood" [5] and this
is shown by several articles. Henry Rzepa creates all his
molecules as semantic objects, while in Open Bibliogra-
phy [4] we use our newly developed BibJSON and Scho-
larlyHTML to create and publish the article.
I am confident that because of the Openness, the
readership of these articles will be much larger than if
they were published in a closed access manner, however
apparent the prestige of the closed access publisher. It is
easy for a mature scientist, such as myself, to publish in
an Open Access journal as it is unlikely to affect my
career. I'd like to pay credit to all young people who
have decided to publish in OA journals despite the pos-
sible current (irrational) view that this is detrimental to
how they are regarded. I believe that their faith will be
justified and that in a very short time the work pub-
lished here will have higher visibility, and possibly
regard, than if it had been published in an apparently
more prestigious, closed access journal.
Open Data
Five years ago the term "Open Data" was unknown (I
started a Wikipedia page [6] to collect instances of
usage). Now it is ubiquitous. Most of the public funders
(Research Councils UK, Wellcome Trust, NIH, NSF and
other national bodies) are now requiring that research-
ers make their data Openly available.
The first challenge is cultural; researchers have to be
persuaded that Open Data is not only inevitable but also
beneficial to their activities. Even when an author is con-
vinced of the value of publishing Open Data, it is usually
not trivial to do so. Unlike a manuscript where a static,
human-readable, webpage can be posted and served for
all time, data are frequently much more complex. They
may be very large (petabytes), complex in both semantics
and organisation, and even distributed over several sites.
In bioscience, it is becoming commoner to see data pub-
lished as Excel and other spreadsheets but in chemistry
(apart from crystallography) the tradition is still to pub-
lish supplemental data as PDF, which destroys much of
its semantics. One simple and achievable goal of these
publications is to convince chemists that publishing in
semantic form is "almost" no effort, compared to the
effort of producing the data in the first place. If we were
able to persuade researchers in computational chemistry
simply to deposit their logfiles (usually less than 5 MB),
or the Word documents for their syntheses, machines
would be able to revolutionise the practice and under-
standing of computational and experimental chemistry.
Open Access (CC-BY) implies (but may not explicitly
state) that articles can be repurposed by machine extrac-
tion of data items, e.g. by OSCAR.
We have also addressed the question of what is Open
Data and how do we identify it, both to humans and to
machines. For many chemists, this may be the first time
that they have had to consider this problem, but it is
becoming increasingly required in many fields and for
that reason, we have in several papers, discussed the
question of licenses and contracts.
Murray-Rust Journal of Cheminformatics 201 1, 3:48
http://www.jcheminf.eom/content/3/1/48
Page 3 of 7
The semantic vision
I was excited and entranced by chemical informatics in
the mid-70s as a result of some of the ground-breaking
work done between chemists and computer scientists.
The visions of LHASA, CONGEN, DENDRAL and
others opened up the prospect of a chemical world
where machines were seen as valuable allies of humans.
This vision was also held in the world of chess, and
indeed many chemical informatics processes are similar
to the operations required in 'artificial intelligence'.
Chess has succeeded. Machines can now beat any
human on the planet. For whatever reasons, chemistry
turned its back on AI and there have been few develop-
ments in the last three decades. A necessary condition is
the Open availability of semantic data, and if this comes
about then there will be a major discontinuity in the
way we practice chemistry.
In 1994, Henry Rzepa and I attended the first WWW
conference in CERN. It was a remarkable occasion
where a number of very early adopters showed what
was possible with web technology and gave a vision of
how this would change the way that science was not
only reported but also done. There was a feeling that
we were entering a new frontier where anything was
possible and where new rules would evolve to fit the
vision of cyberspace. The final session, where Tim Ber-
ners-Lee showed how semantic operations altered the
real world was one of the seminal events of my last 20
years.
Semantic reality
Not surprisingly, semantic progress has turned out very
differently from our original visions. We have stuck to
our view that science must adopt semantic technologies
including both the formal description of objects and the
links between them. Chemistry has been very slow to
adopt this, but other subjects have been much more
adventurous and in bio- and geo-sciences it is routine to
create objects which are derived from, and linked to,
other objects.
Many of the problems are cultural and for that reason
several of the papers in this issue address the need to
change attitudes as much as the technical requirements
for the electronic infrastructure. I believe that it is
impossible to do modern science unless the key infor-
mation is completely Open. This applies, for example, to
identifier systems, bibliographic data and much factual
data. Chemistry, unfortunately in my opinion, has a
strong ingrained culture of possession and sale/licensing
of data. For this reason, it is often behind other subjects
and, in the recent SOAP report [7] chemistry was high-
lighted as several years behind bioscience in its
approach to Openness.
For that reason, some of the things we report are pro-
totypes rather than completely established semantic
resources. The biosciences have convinced funders that
it is valuable to have completely Open access to
sequences, structures, ontologies, etc. In chemistry, most
of the freely accessible material has been produced by
enthusiasts rather than large funded organisations.
Indeed, it is the availability of bioscience resources such
as ChEBI which to some extent drive the adoption of
Open chemical semantics.
It is also an opportunity for our group to summarise
formally several of the projects that they have been
working on for several years. It is a feature of informa-
tion projects that there is often no clear point at which
a formal publication is immediately relevant and indeed
this highlights the disconnect between publishing neces-
sary information and publishing to acquire a community
seal of approval ('a publication').
Chemistry as a community
Many disciplines have a close sense of community (I
highlight crystallography which has a real sense of com-
munal practice and goals). Many of the ideas in these
articles have been inspired by crystallographic practice,
its outstanding scientists, and its International Union -
probably the leader in driving semantic approaches.
Scientific communities are now common on the web
(and even have commercial value) and several of the
articles emphasise the role of ad hoc and other commu-
nities. The web has the great advantage that anyone can,
relatively easily, find those people and organizations who
share values and goals, amplifying minority or early-
adopter initiatives. Their dynamics are unpredictable
and most die, but enough survive to provide world-
changing mechanisms.
There is no clear community focus for chemistry over-
all (though sub sections - such as WATOC (World
Association of Theoretical Organic Chemists) may pro-
vide one). The main drivers (funding, advancement,
commerce) have always been present but the modern
era has amplified and often dehumanised them. With
growing emphasis on publication to generate the income
of learned societies there is a decreasing sense that they
act as nuclei for community to grow communal goods.
Because of this, chemistry has almost no public ontol-
ogies, and we have a vicious circle. Without ontologies,
authors cannot reasonably be expected to create seman-
tic information, and without a clear need for semantic
information, the community will not take on the consid-
erable load of creating ontologies. Several of the articles
argue that the creation of lightweight dictionaries and
other semantic metadata is affordable by the community
and I believe that if the communal will is present, then
Murray-Rust Journal of Cheminformatics 201 1, 3:48
http://www.jcheminf.eom/content/3/1/48
Page 4 of 7
it would be possible through bodies such as IUPAC and
others, to create a full semantic infrastructure for much
of the current published chemistry.
The current legal and contractual restrictions on re-
using chemical data are seriously holding chemistry
behind other subjects. These articles in this issue are
not the place for polemics but we hope that traditional
creators of information resources in chemistry will now
think carefully about the value of making their data fully
Openly available. This will be a considerable act of faith,
because it will need a change in business model. Some
of those providers have been traditionally held in high
esteem by the community and if they use that esteem
they have the opportunity to change the practice of che-
mical informatics.
The value of informatics
A major feature underlying all of the papers is to give
an insight into the process of creating an information
ecology. Some of them represent scientific discoveries
(e.g. Rzepa) but most are concerned with building a
coherent infrastructure usable by the community. It may
be useful to liken this infrastructure to the development
of instrumentation in many branches of science. Science
depended on the microscope, the telescope, the spectro-
graph, the Geiger counter and many other types of
instrumentation. There is sometimes a modern tendency
to discount instrumentation and infrastructure as not
being 'proper science'. We hope that this issue will
redress that balance.
As an analogy, Mendeleev required access to other
scientists' work to produce his classification, as did Paul-
ing, Woodward and Hoffmann. I believe that the current
chemical and related literature contains considerable
amounts of undiscovered science, and that with 'infor-
mation telescopes' we can start to discover this.
The development of infrastructure is a lengthy pro-
cess. The web has, perhaps, given us an optimistic idea
of the speed at which new ways of working can be
implemented. We are still often governed by Planck's
observation ("Science progresses one funeral at a time")
and this is equally true for some areas of informatics.
Several of the articles reflect the difficulty of catalysing
change in what is essentially a mature and therefore
conservative discipline.
Henry Rzepa and I were active contributors to the
development of XML by running the XML-DEV mailing
list (1997). This was a highly successful Open example
of true collaboration and for me it culminated in the
development of the SAX protocol late that year. XML
had been seen as a primarily document- plus typeset-
ting-oriented discipline, but some of us realised its
potential for data modelling and transfer, and therefore
the need for APIs in XML tools. I nagged continually at
the community, and, as a result, Tim Bray, David Meg-
ginson and others helped us to develop the SAX proto-
col, now implemented in every computer on the planet.
This protocol was developed in a calendar month and
has stood the test of time exceedingly well.
This, perhaps, gave Henry, myself and other early
adopters a false vision of how rapidly we would be able
to take these new ideas to chemistry. Over the decade
2000-2010, we have developed and published specifica-
tions and software which we believe represent a formal
but implementable infrastructure for chemical infor-
matics. The uptake of these has been slow, but unlike
some new technologies has not gone through the hype
and depression syndrome (Gartner curve). In fact, this
timescale is not so unusual. HTML itself has been
through nearly 20 years of deployment and only now,
with HTML5, does it appear that the community is
starting to work together rather than fracturing for
organisational and personal advantage. Similarly, seman-
tic MathML is taking many years to become established.
It is not that these systems, including CML, have been
supplanted by 'better' ways of doing things, but more
that the community as a whole is yet to be enlightened
about the value of semantics.
Publishing
Scientific publishing should be a key part of the seman-
tic revolution, but it has so far completely failed to
address the vision. This is ironic in that HTML, which
catalysed the web, was developed as a way for scientists
at CERN to share information, but we have currently
regressed to a completely non-semantic (PDF) manner
of communication. This has replicated the traditional
paper format so well that the only discernable value is
to transfer the printing bill from the publishers to the
readers. Not only has this held back our imagination,
but has actually moulded the new, and I think some-
what unfortunate, values in the publication process. In
many cases, authors now publish primarily to attain
numerical estimates of worth above communication,
validating experiments and other fundamental aspects of
the process.
The web can, and, we hope, will, change this. Where
you publish should not matter so long as the material is
discoverable and the process of reviewing is understood.
I believe that the papers in this issue will be read well
beyond the cheminformatics community, because their
value will be discerned and communicated by methods
supplementary to the formal publishing process.
A major challenge in this issue is that the timescales
for many of the projects is complex. In many lab experi-
ments (such as chemical synthesis or chemical crystallo-
graphy) the process is clearly bounded, "make this
compound", "check success through crystal structure
Murray-Rust Journal of Cheminformatics 201 1, 3:48
http://www.jcheminf.eom/content/3/1/48
Page 5 of 7
analysis". Each (normally) has a clear endpoint and can
be published as a static document.
In contrast how should we publish software? We use
public repositories and these contain a complete record
and the current semantic object. If we wish to tell the
world about a development we put it on the mailing list.
There is no need for a formal publication for those
aspects. The motivation is therefore primarily to estab-
lish our reputation and there is no simple way to decide
when this should be done. JUMBO has had six revisions
- should this result in six papers or one or none? (Actu-
ally the only JUMBO paper is in 1997 [8]). Six papers
would confuse - but after 14 active years it's time for
another, I think, which explains the design process.
OSCAR3 has its citable publication - a few years back -
and we feel it's useful to publish our current ideas,
which have more to do with software engineering than
new chemical entity recognition.
Or data? Crystaleye was a spinoff from Nick Day's the-
sis - it wasn't planned as a separate project - but simply a
knowledgebase to use for his calculations. It does not
have a formal publication other an archive of a presenta-
tion [9]. The system has been running 5 years without
serious mishap but the lack of a formal publication
makes it difficult to write papers which refer to it. So we
shall do this - after the fact. But if we had a semantic
publication process it would be "published" by now.
The need to change publication processes
Historically the scientific community has required the
following from the publication process:
♦ Establishment or priority and authorship
♦ Exposure and preservation of the scientific record
♦ Communicating the science to one's peers and the
wider world
♦ Allowing the science to be moderated by peers and
others ("reviewing").
There is perhaps an additional axis in today's biblio-
metric-obsessed world: allowing the work to receive an
official assessment of merit.
However the publication process is out of sync with
the modern web-based world ("Web 2.0") which allows
the publication process to encourage and support:
♦ Collaborative working (as seen in many projects
such as Wikipedia, Open StreetMap, and in science,
Galaxy Zoo). Here each contribution is often an
atom in a much larger cloud and the publication
process is continuous rather than discrete. Wikipedia
articles are "never finished" though there are some
efforts to provide frozen versions. This is a strong
theme of this "issue".
♦ Independence of the source of publication.
Given the ability of search engines, and the social
networks, to discover anything of value it matters
less where something is published. Other than the
choice of reviewers the primary issues is whether a
piece of information is accessible or limited. History
has shown that high quality scholarship on the web
will usually surface regardless of where it is
published.
♦ Creation of continuous semantic objects. By
recording everything we do, annotating it, and revis-
ing it, we can maintain a current semantic publica-
tion object at all times, including a revisitable
history. This should be the object of scientific publi-
cation, not the current PDF.
♦ The paper (semantic object) as a driver of
research. The idea of writing a paper before the
research is carried out is valuable and not novel (e.g.
George M. Whitesides [10,11]). Here, however, we
extend the paper to semantic objects (programs,
spreadsheets, molecules, bibliography, etc.).
Several of the papers in the article have adopted these
later ideas. This has been most obvious in Open Biblio-
graphy [4] where effectively the whole concept and tech-
nology has been driven during the 6 weeks of "writing
the paper". We started with a blank page and four peo-
ple (William Waites, Mark MacGillivray, Ben O'Steen,
Peter Murray-Rust) and during the writing process
brought in new authors (Jim Pitman, Peter Sefton,
Richard Jones) and communally created the design,
technology and "paper". The introduction of Scholarly
HTML made this paper self-referential. The Quixote
paper [12] has also dramatically driven the design of
Quixote, particularly the social aspects.
The content of the issue
Several of the articles (CML [13], OSCAR [14], OPSIN,
dictionaries [15], WWMM [16]) in this issue cover a
decade of work. We hope this will be useful to scientists
and scholars who wish to implement new ideas and to
give them some idea of what works, and what, more
commonly, does not work. Sometimes only the passage
of time and persistence achieves some level of success.
Again, the short-termism of many infrastructural pro-
jects militates against developing a good platform for
the future.
The long timescales highlight the difficulty of conven-
tional publication. The world knows of these projects
through blogs, online resources, user communities and
so on, and a conventional learned paper has little value
in communicating or preserving. Its prime merit is to
achieve a traditional numeric merit for the work, often
delayed by several years through the citation
Murray-Rust Journal of Cheminformatics 201 1, 3:48
http://www.jcheminf.eom/content/3/1/48
Page 6 of 7
mechanism. I believe that it is important to change the
values that we use in our assessment of on-going scien-
tific endeavours, and avoid ritual publication.
Some of the articles (Wilbanks [17], Neylon [18]) dis-
cuss the philosophy and practice of new models of
scientific endeavour and communications. Some of the
articles have a retrospective look (CML [13], Zaharevitz
[19]) but the fundamental principles are still as impor-
tant today as when the work was started. A number
represent growing points whose development is highly
unpredictable. These include the WWMM [16], where
the vision of a distributed peer-to-peer knowledge
resource has had to wait a decade until it could be
implemented. The Quixote project is only months old
but takes this vision and has already built an impressive
prototype, which I expect to set the model for computa-
tionally-based knowledge repositories. These projects
rely heavily on community, and this is most clearly
shown in the Blue Obelisk movement [20] which aims
to, and has largely succeeded in, creating an Open infra-
structure for cheminformatics. A major motivation for
this has been not just that software and data should be
universally available but also that this is the only man-
ner in which science can be reputably validated both by
humans and machines. An example of the need for such
validation is shown in Henry Rzepa's article [21].
The OpenBibliography project represents a socio-poli-
tical imperative whose time has come, and for which the
technology is appropriate. A year ago the JlSC-funded
OpenBibliography project could not point to a signifi-
cant amount of open resources, but in the last year we
have helped to catalyse the release of both library data
(BL, CUL and several others), and also of scientific bib-
liography. It is impossible to find Open resources for
scientific bibliography but we believe that in a year's
time, readers can look back and see this as a key start-
ing point. It is worth noting that the very process of
writing this article has generated a great deal of new
formalism and tools in Open bibliography, and effec-
tively given major impetus to the BibJSON approach.
Other articles (OSCAR, Open patents [22], diction-
aries, CML and CMLLite [23]) describe the design and
implementation of information systems. In general,
there is little funding for developing scientific software,
though we have been fortunate to receive some from
eScience and from JISC We have taken this responsibil-
ity very seriously and our group has installed many of
the cutting-edge ideas and tools for building high-qual-
ity systems. Members of the group collaborate and use
common servers for their work (as far as possible on
Open sites). Software libraries are used and re-used
between group members, and we have developed a cul-
ture of communal ownership and responsibility. By
using the continuous integration system (Jenkins), a
failure in one library can immediately be highlighted
and corrected before it impacts on other projects.
Where funding is available, and where the culture allows
it, we would very strongly recommend these practices in
other groups. Again, many of these systems have taken
over a decade to evolve from initial concepts to mature
libraries, but we believe that almost all the systems
reported in this article have been heavily re-factored
and, within the academic environment, represent an
attainable level of quality.
The future
Several articles are growing points, perhaps none more
than AMI [24] where we explore the human-cyber inter-
face in a laboratory, a "memex" which may ultimately
replace some (but hopefully not all) of the role of the
chemistry laboratory. In the same way Quixote repre-
sents a memex for computational chemistry. There is no
clear pathway for AMI (and I predict that this will be
largely influenced by what happens in the domestic
arena).
The relative stagnation of chemical informatics sug-
gests that change is unlikely to happen from within
chemistry. As progress occurs in other areas (retail,
bioscience etc.) chemistry may be dragged into the
semantic world regardless. If chemists wish to retain
control over their own systems they will be wise to
start investing in Open semantic environments,
because otherwise the rest of the world will do it for
them.
How can chemical informatics survive and prosper? I
think the most likely model will be Open publishing,
not just of texts but data and other resources, mandated
and paid for by funders. Those publishers which are
able to adopt an Open model rather than continuing to
maintain their own walled gardens, will ultimately tri-
umph, and probably more rapidly than we expect.
Received: 29 June 2011 Accepted: 14 October 2011
Published: 14 October 2011
References
1. Wikipedia: Hacking (innovation). [http://en.wikipedia.org/wiki/Hacker_%
28programmer_subculture%29].
2. Smit E, van der Graaf M: Journal Article Mining: A Research study into
Practices, Policies, Plans... and Promises. Publishers Research Council 2011,
153.
3. Intellectual Property Watch: Lessig At CERN: Scientific Knowledge
Should Not Be Reserved For Academic Elite, [http://www.ip-watch.org/
weblog/201 1 /04/1 9/lessig-at-cem-scientific-knowledge-should-not-be-
reserved-for-academic-elite/].
4. Jones R, (VlacGillivray M, Murray-Rust P, Pitman J, Sefton P, O'Steen B,
Waites W: Open Bibliography for Science, Technology, and Medicine. J
Cheminform 2011, 3:47.
5. Wikipedia: Eating your own dog food, [http://en.wikipedia.org/wiki/
Eating_your_own_dog_food].
6. Wikipedia: Open science data, [http://en.wikipedia.org/wiki/
0pen_science_data].
Murray-Rust Journal of Cheminformatics 201 1, 3:48
http://www.jcheminf.eom/content/3/1/48
Page 7 of 7
7. Study of Open Access Publishing: Report from the SOAP Symposium.
[http://project-soap.eu/report-from-the-soap-symposium/].
8. Murray-Rust P: JUMBO: An Object-based XML Browser. World Wide Web
Journal 1997, 2(4):1 97-206.
9. Downing OJ, Day Nicholas E, Murray-Rust P: CrystalEye - From Desktop to
Data Repository.[http://www.dspace.cam.ac.uk/handle/1 81 0/1 961 86].
10. Wikipedia: George M. Whitesides. [http://en.wikipedia.org/wiki/George_M.
_Whitesides].
1 1. Whitesides GM: Whitesides' Group: Writing a Paper. Advanced Materials
2004, 16:1375-1377.
12. Adams SE, de Castro P, Echenique P, Estrada J, Hanwell MD, Murray-Rust P,
Sherwood P, Thomas J, Townsend J: The Quixote project: Collaborative
and Open Quantum Chemistry data management in the Internet age. J
Cheminform 201 1, 3:38.
13. Murray-Rust P, Rzepa HS: CML: Evolution and Design. J Cheminform 201 1,
3:44.
14. Jessop DM, Adams SE, Willighagen EL, Hawizy L, Murray-Rust P: OSCAR4: a
flexible architecture for chemical text-mining. J Cheminform 201 1 , 3:41 .
15. Murray-Rust P, Townsend J, Adams SE, Phadungsukanan W, Thomas J: The
semantics of Chemical Markup Language (CML): dictionaries and
conventions. J Cheminform 201 1, 3:43.
16. Murray-Rust P, Adams SE, Downing J, Townsend J, Zhang YY: The semantic
architecture of the World-Wide Molecular Matrix (WWMM). J Cheminform
2011, 3:42.
17. Wilbanks J: Openness as Infrastructure. J Cheminform 201 1, 3:36.
18. Neylon C: Three stories about the conduct of science: Past, future, and
present. J Cheminform 201 1, 3:35.
19. Zaharevitz DW: Adventures in Public Data. J Cheminform 201 1, 3:34.
20. O'Boyle N, Guha R, Willighagen EL, Adams SE, Alvarsson J, Bradley JC,
Filippov IV, Hanson RM, Hanwell MD, Hutchison GR, James CA, Jeliazkova N,
Lang ASID, Langner KM, Lonie DC, Lowe DM, Pansanel J, Pavlov D,
Spjuth O, Steinbeck C, Tenderholt AL, Theisen KJ, Murray-Rust P: Open
Data, Open Source and Open Standards in chemistry: The Blue Obelisk
five years on. J Cheminform 201 1, 3:37.
21. Rzepa HS: The past, present and future of Scientific discourse. J
Cheminform 201 1, 3:46.
22. Jessop DM, Adams SE, Murray-Rust P: Mining chemical information from
Open patent. J Cheminform 201 1, 3:40.
23. Townsend J, Murray-Rust P: CMLLite: a design philosophy for CML. J
Cheminform 201 1, 3:39.
24. Brooks BJ, Thorn AL, Smith M, Matthews P, Chen S, O'Steen B, Adams SE,
Townsend JA, Murray-Rust P: Ami - The Chemist's Amanuensis. J
Cheminform 201 1, 3:45.
doi: 1 0. 1 1 86/1 758-2946-3-48
Cite this article as: Murray-Rust: Semantic science and its
communication - a personal view. Journal of Cheminformatics 201 1 3:48.
Publish with ChemistryCentral and every
scientist can read your work free of charge
"Open access provides opportunities to our
colleagues in other parts of the globe, by allowing
anyone to view the content free of charge. "
W. Jeffery Hurst, The Hershey Company.
■ available free of charge to the entire scientific community
• peer reviewed and published immediately upon acceptance
• cited in PubMed and archived on PubMed Central
• yours — you keep the copyright
Submit your manuscript here:
http://www.chemistrycentral.com/manuscript/ Chemistry C6tT tra I